Digital VCR with trick play stream derivation

ABSTRACT

A consumer digital video cassette recorder may record an advanced television signal having an MPEG like signal format. The predictive nature of the MPEG like signal format requires that additional I frame data be generated and recorded together with a normal play speed data stream to facilitate non-standard speed, or trick play, reproduction. Additional I frame data streams are generated specifically for each replay speed and are written within recorded tracks to facilitate reproduction at the predetermined speeds. Various inventive methods for the derivation of full resolution and reduced resolution trick play data streams are disclosed. Inventive trick play data stream generation is disclosed for real time recording by consumer apparatus and non-real time normal and trick play data stream generation for use with pre-recorded digital media.

This invention relates to the field of digital video recording, and in particular to the derivation, recording and reproduction of MPEG like advanced television signals at non-standard speeds.

BACKGROUND OF THE INVENTION

A digital video cassette recorder employing a helical scanning format has been proposed by a standardization committee. The proposed standard specifies digital recording of standard definition (SD) television signals, for example, NTSC or PAL, and high definition television signals having an MPEG compatible structure, such as a proposed Grand Alliance or GA signal. The SD recorder utilizes a compressed component video signal format employing intra field/frame DCT with adaptive quantization and variable length coding. The SD digital VCR or DVCR may digitally record either NTSC or PAL television signals and has sufficient data recording capability to record an advanced television signal.

A specification of the GA signal is included in a draft specification document titled Grand Alliance HDTV System Specification, published in the 1994 Proceedings of the 48th Annual Broadcast Engineering Conference, Mar. 20-24, 1994. The GA signal employs an MPEG compatible coding method which utilizes an intra-frame coded picture, termed an I frame, a forward predicted frame, termed a P frame, and a bidirectionally predicted frame, termed a B frame. These three types of frames occur in groups known as GOPs or Groups Of Pictures. The number of frames in a GOP is user definable but may comprise, for example, 15 frames. Each GOP contains one I frame, which may be abutted by two B frames, which are followed by a P frame.

In an analog consumer VCR, "Trick Play" or TP features such as picture in forward or reverse shuttle, fast or slow motion, are readily achievable, since each recorded track typically contains one television field. Hence, reproduction at speeds other than standard may result in the reproducing head, or heads, crossing multiple tracks and recovering recognizable picture segments. The picture segments may be abutted and provide a recognizable and useful image. An advanced television or MPEG like signal may comprise groups of pictures or GOPs. The GOP may, for example, comprise 15 frames and each frame may be recorded occupying multiple tracks on tape. For example, if 10 tracks are allocated to each frame, then a 15 frame GOP will comprise 150 tracks. During play speed operation, I frame data is recovered which enables the decoding and reconstruction of the predicted P and B frames. However, when a DVCR is operated at a non-standard reproduction speed, the replay heads transduce sections or segments from the multiple tracks. Unfortunately these DVCR tracks no longer represent discrete records of consecutive image fields. Instead these segments contain data resulting mainly from predicted frames. However, since predicted P and B frames require preceding data to facilitate decoding, the possibility of reconstructing any usable frames from the reproduced pieces of data is greatly diminished. In addition, the MPEG data stream is particularly unforgiving of missing or garbled data. Thus, to provide "Trick Play" or non-standard speed replay features requires that specific data be recorded, which when reproduced in a TP mode, is capable of image reconstruction without the use of adjacent or preceding frame information. The specific data, or "Trick Play" data, must be semantically correct to allow MPEG decoding. In addition, a selection of "Trick Play" speeds may require different TP data derivation and may require TP speed specific recorded track locations.

To be capable of reconstruction without preceding frame data requires that "Trick Play" specific data be derived from I frames. The "Trick Play" specific data must be syntactically and semantically correct to allow decoding, for example, by a GA or MPEG compatible decoder. In addition the "Trick Play" or TP data must be inserted into the MPEG like data stream for recording together with the normal play, MPEG like signal. This sharing of the recording channel data capacity may impose constraints in terms of the TP data bit rate which may be provided within the available track capacity. The TP data bit rate may be variously utilized or shared between spatial and/or temporal resolution in the derived or reconstructed TP image.

Reproduced "Trick Play" image quality may be determined by the complexity of the TP data derivation. For example, a consumer DVCR must derive TP data during recording, essentially in real-time and with only nominal additional data processing expense added to the DVCR cost. Thus real-time consumer DVCR "Trick Play" image quality may appear inferior to TP image data derived by non-real time image processing utilizing sophisticated digital image processing. With non-real time TP image processing, for example, an edited program may be processed, possibly on a scene by scene basis, possibly at non-real-time reproduction speeds, to enable the use of sophisticated digital image processing techniques. Such non-real time processing may inherently provide higher quality "Trick Play" images than that attainable with real time processing.

SUMMARY OF THE INVENTION

A method for generating in real-time an MPEG compatible digital image representative signal for recording to facilitate reproduction at more than one speed. The method comprises the steps of: receiving a data stream comprising an MPEG compatible digital image representative signal; decoding the data stream to extract intra-coded data; storing predetermined intra-coded data from the extracted intra-coded data to form intra-coded frames having reduced spatial resolution; periodically selecting an intra-coded frame from the stored frames having reduced spatial resolution; sequentially selecting the periodically selected intra-coded frame and the data stream to form a bit stream; and recording the bit stream in real time.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a simplified block diagram of an inventive system for the real-time generation of a "trick-play" data stream having low resolution.

FIG. 2 shows a simplified block diagram of a further inventive system for the real-time generation of a full resolution "trick-play" data stream.

FIG. 3 shows a simplified block diagram illustrating an inventive method for generating low resolution "trick-play" data streams for inclusion in pre-recorded digital records.

FIG. 4 shows a simplified block diagram illustrating a further inventive method for generating "trick-play" data streams for inclusion in pre-recorded digital records.

FIG. 5 illustrates the derivation of predicted macroblock DC coefficients.

FIG. 6 shows a simplified partial block diagram illustrating a further inventive method for non-real-time generation of pre-recorded records.

FIG. 7 shows a simplified partial block diagram illustrating another inventive method for non-real-time generation of pre-recorded records.

DETAILED DESCRIPTION

In a consumer digital video cassette recorder, major considerations in the real-time generation of a trick-play stream are the complexity and cost of processing required, and the need to keep this cost at a reasonable level. For this reason, the processing utilized in the generation of a real-time trick-play data stream may be limited to extracting pieces of the existing bit stream and implementing minor modifications to bit-stream parameters.

"Trick-play" data streams must be produced in real-time by extracting independent intra-information pieces from the original data stream. This intra-information may come from intra-frames, intra-slices, and/or intra-macroblocks. The source selected for I frame data derivation depends on the form of intra refresh employed in the original stream, and for exemplary purposes it is assumed that either an intra-frame or intra-slice refresh method is employed.

In a first inventive method of real-time generation, a low spatial resolution "Trick Play" data stream is derived. The low spatial resolution trick-play stream may, for example, have resolution according to the CCIR 601 standard (720×480 pixels), regardless of the original HDTV stream resolution. Since the effective available bit-rate for trick-play streams is limited to nominally 2 Mbit/s, employing low spatial resolution in this manner results in fewer bits being used per frame, and thus a relatively high temporal resolution may be achieved. However, this low spatial resolution may only be practical if an advanced television decoder and display is capable of such resolution.

In a second inventive method a trick-play stream is generated having the same resolution, or pixel count, as the original HDTV material. However, since the usable trick-play bit-rate is limited by the recording channel capacity of nominally 2 Mbit/s, a trade-off exists between spatial and temporal resolution. Thus the provision of a full spatial resolution "Trick Play" mode effectively requires that the temporal resolution be reduced to remain commensurate with the TP data channel capacity.

The first inventive method for real-time generation of low spatial resolution "Trick Play" data is illustrated in FIG. 1. In this exemplary block diagram, trick-play speeds of 5×, 18× and 35× are generated. For each TP speed, low-resolution, intra-coded frames are constructed from a received MPEG like transport stream. By detecting MPEG header information in the transport stream down to the slice level, intra slices can be extracted, processed and used to create a single I-frame in memory 110. The extraction and processing stage 100 performs three tasks: extracting macroblocks for the construction of a TP I-frame, re-encoding DC transform coefficients when necessary using DPCM encoding, and discarding unwanted AC transform coefficients when necessary. Having constructed and stored a low-resolution TP I-frame in memory 110, it is utilized in the generation of speed specific data streams for each trick-play speed.

A radio frequency carrier, modulated responsive to an MPEG compatible signal, is received by receiver 05. The modulated carrier may be sourced from either an antenna or a cable, not shown. Receiver 05 demodulates and processes the received carrier to produce an MPEG compatible advanced television transport stream 09.

The advanced television transport stream 09 is demultiplexed in block 20 to obtain only the Packetized Elementary Stream or PES stream corresponding to the advanced television video information. The PES stream is decoded in block 30 to extract from the packets the MPEG encoded video stream payload. Having extracted the MPEG encoded stream, the required intra-coded information may be detected and extracted. Sequence detection block 40 examines the bit stream for the occurrence of a start code characterized by twenty five 0's followed by 1, followed by an 8 bit address indicating MPEG video header. Picture detection is performed in block 50 and in block 60 slice layers are detected. Since an intra coded "trick-play" I frame is to be constructed, only intra-slices are extracted. Intra-slices contain only intra-coded macroblocks, and are characterized by a 1-bit intra_slice flag in the slice header. Thus when the intra_slice flag is set to 1 the entire slice is passed to the "data extraction and processing" stage 100. The intra detection process of block 70 assumes that either intra-frame or intra-slice refresh techniques are employed and also that the intra_slice flag in the slice header is set when appropriate. If the intra_slice flag is not set or intra-macroblock refresh is used then a further level of detection down to macroblock level is required.

The data extraction and processing stage 100 selects from the intra-coded macroblocks extracted in block 70 only the intra information which is utilized for constructing the various trick-play data streams. In addition block 100 performs any processing which may be necessary to ensure the syntactic and semantic correctness for MPEG compatibility of the resulting reconstructed TP I-frame. Since the reconstructed TP I-frame is of lower spatial resolution than the original MPEG stream, only a sub-set of the detected intra-macroblocks is required. To determine which macroblocks or MBs are to be kept and which are to be discarded, either a mathematical function or a predefined look-up table may be employed. The lower spatial resolution frame results from the selected patchwork of macroblocks. A controller stage 90 is coupled to processing stage 100 and provides either the calculation required by the mathematical function or the look-up table for determining macroblock selection.

The relationship between the MB position in the new low-resolution I-frame,

    mb(i, j), \quad i = 0, 1, 2, \ldots, n-1, \quad j = 0, 1, 2, \ldots, m-1,

(where m and n are the new I-frame width and height in MBs respectively and i and j refer to the MB row and column) and the original full-resolution frame,

    MB(I, J), \quad I = 0, 1, 2, \ldots, N-1, \quad J = 0, 1, 2, \ldots, M-1,

(where M and N are the original frame width and height and I and J are the MB row and column) is given by:

    i \text{ (low-resolution row)} = \left[ \frac{I \cdot (n-1)}{N-1} \right]

    j \text{ (low-resolution column)} = \left[ \frac{J \cdot (m-1)}{M-1} \right]

where the square brackets [x] denote the integer value closest to x.
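The mapping above can be sketched in Python as a way of building the look-up table mentioned for controller stage 90. This is only an illustrative sketch: the function names are hypothetical, exact halves are assumed to round upward, and the rule of keeping the first original macroblock that lands on each low-resolution position is an assumption, since the text does not state how ties between several candidate macroblocks are resolved.

```python
def nearest_int_ratio(num, den):
    """Integer closest to num/den for non-negative integers (halves round up)."""
    return (2 * num + den) // (2 * den)


def low_res_position(I, J, N, M, n, m):
    """Map original MB row/column (I, J) to low-resolution row/column (i, j)."""
    i = nearest_int_ratio(I * (n - 1), N - 1)
    j = nearest_int_ratio(J * (m - 1), M - 1)
    return i, j


def build_selection_table(N, M, n, m):
    """For each low-resolution position, record one original MB to keep.

    Assumption: the first original MB (in raster order) that maps to a
    position is kept; all later candidates are discarded.
    """
    table = {}
    for I in range(N):
        for J in range(M):
            table.setdefault(low_res_position(I, J, N, M, n, m), (I, J))
    return table


# Example: a 120 x 68 MB source frame reduced to 45 x 24 MBs (720 x 384 pixels).
table = build_selection_table(N=68, M=120, n=24, m=45)
print(len(table))  # number of macroblocks kept for the TP I-frame
```

Because the low-resolution grid is no larger than the original grid, every low-resolution position receives at least one candidate, so the table has exactly n × m entries.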

The low resolution TP I frame utilizes a subset of the macroblocks from the original frame with the remaining non-selected MBs being discarded. FIG. 5 illustrates an exemplary 4:2:0 sampled signal comprising three intra-coded macroblocks MB1, MB2 and MB3, where each comprises blocks 0, 1, 2, 3, 4 and 5. Macroblock 2 is crossed through to illustrate non-use in constructing the reduced resolution TP I frame. The DC coefficients of each luminance and chrominance block are depicted in FIG. 5 with dark stripes. The DC coefficients are predicted from within each macroblock, with the DC coefficient of the first block of an MB being predicted from the last DC coefficient of the immediately preceding MB of the slice. The arrows in FIG. 5 illustrate the prediction sequence. Thus, if the preceding MB, for example MB 2 of FIG. 5, is not selected, certain DC coefficients must be re-calculated from the newly abutted macroblock, as depicted by arrows NEW of FIG. 5, and re-encoded using DPCM. This re-encoding process is performed as the macroblocks are written to the I-frame memory 110.
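The DPCM re-encoding step can be illustrated with a small Python sketch. It assumes separate DC predictors for luminance and for each chrominance component, a predictor reset value of 128 at the start of a slice, and DC values held as plain integers rather than variable-length codes; the function names are hypothetical and the sketch is not taken from FIG. 5.

```python
DC_RESET = 128  # assumed predictor reset value at the start of a slice


def _component(block_index):
    """Blocks 0-3 are luminance, block 4 is Cb, block 5 is Cr (4:2:0)."""
    if block_index < 4:
        return "Y"
    return "Cb" if block_index == 4 else "Cr"


def dc_decode(diff_mbs):
    """Turn per-block DC differentials into absolute DC values."""
    pred = {"Y": DC_RESET, "Cb": DC_RESET, "Cr": DC_RESET}
    decoded = []
    for diffs in diff_mbs:
        values = []
        for blk, diff in enumerate(diffs):
            comp = _component(blk)
            pred[comp] += diff
            values.append(pred[comp])
        decoded.append(values)
    return decoded


def dc_encode(abs_mbs):
    """Turn absolute DC values back into differentials (DPCM)."""
    pred = {"Y": DC_RESET, "Cb": DC_RESET, "Cr": DC_RESET}
    encoded = []
    for values in abs_mbs:
        diffs = []
        for blk, value in enumerate(values):
            comp = _component(blk)
            diffs.append(value - pred[comp])
            pred[comp] = value
        encoded.append(diffs)
    return encoded


def drop_and_reencode(diff_mbs, keep_indices):
    """Keep only the selected macroblocks and re-predict across the new joins."""
    absolute = dc_decode(diff_mbs)
    return dc_encode([absolute[k] for k in keep_indices])
```

Dropping the middle macroblock of three changes only the differentials of the macroblock that now abuts a new neighbor; the decoded DC values of the retained macroblocks are unchanged.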

If the HDTV video sequence originated from an interlaced scanning source, an optional processing step may be included to remove interlace "flicker" exhibited by frozen interlaced fields containing motion. If the temporal resolution of the reconstructed trick-play stream is such that the same frame (two fields) is displayed for more than one frame period, then such interlaced "flicker" may be very noticeable. In field-coded macroblocks this "flicker" artifact may be eliminated by copying the top two blocks of the macroblock, blocks 0 and 1, to the lower two blocks, blocks 2 and 3. This copying within the macroblock effectively makes both fields the same, thus removing any field-to-field motion from the frame. This re-encoding process is performed as the macroblocks are written to the I-frame memory 110.
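A minimal Python sketch of the block copy described above follows. The macroblock is represented simply as a list of six block payloads in block order 0 to 5, and both the representation and the function name are assumptions made for illustration.

```python
def freeze_to_single_field(blocks, field_coded):
    """Copy luminance blocks 0 and 1 over blocks 2 and 3 of a field-coded MB.

    blocks: six entries for blocks 0..5 of one 4:2:0 macroblock.
    Frame-coded macroblocks are returned unchanged.
    """
    if not field_coded:
        return list(blocks)
    fixed = list(blocks)
    fixed[2] = blocks[0]
    fixed[3] = blocks[1]
    return fixed
```

The chrominance blocks 4 and 5 are left untouched, as the text only describes copying the luminance blocks.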

A further function performed by processing stage 100 is the removal of AC coefficients from each macroblock which cannot be accommodated in the newly constructed TP I-frame due to the low bit-rate available for the trick-play streams. To accomplish this, each block is variable-length-decoded to the point where the block will be padded with zeros, indicating the last coefficient of that block. The number of bits for each block is stored and accumulated in a buffer. The bits are counted and when the count exceeds a predetermined number the remaining AC coefficients are unused or deleted. The number of bits per TP MB depends on the overall rate allowed for each trick-play stream and the temporal resolution or number of frame updates per second.
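The bit counting described above might be sketched as follows. The sketch assumes that the length in bits of each variable-length code in a block is already known, that the DC code is always retained, and that the budget is applied per block; none of these details is fixed by the text, the function name is hypothetical, and end-of-block codes are ignored for simplicity.

```python
def truncate_ac(code_lengths, bit_budget):
    """Decide how many coefficient codes of one block fit within bit_budget.

    code_lengths[0] is the DC code length in bits; the remaining entries are
    AC code lengths in scan order. Returns (codes_kept, bits_used).
    The DC code is always kept, even if it alone exceeds the budget.
    """
    used = code_lengths[0]
    kept = 1
    for bits in code_lengths[1:]:
        if used + bits > bit_budget:
            break  # this and all later AC coefficients are discarded
        used += bits
        kept += 1
    return kept, used


# Example: a 20-bit budget keeps the DC code and the first two AC codes.
print(truncate_ac([8, 5, 6, 9, 4], 20))  # (3, 19)
```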

The block diagram of FIG. 1 illustrates the formation of trick-play data streams having the same allocated bit-rate. If the rate differs significantly between TP speeds, for example, to provide differing resolution at each speed, then the number of AC coefficients retained in I-frame memory 110 will also differ for each speed. Hence I-frame memory 110 cannot be shared and separate I-frame memories may be required for each TP speed or bit rate.

The inventive low-resolution TP I-frame assembled in I-frame memory 110 is coupled to three trick-play stream generation stages: 5 times, block 145; 18 times, block 160; and 35 times, block 170. In exemplary FIG. 1, each trick-play stream may be allocated the same bit-rate and temporal resolution, which may represent a preferred configuration. However, not every reconstructed TP I-frame is used for each TP speed. For example, if the I-frame refresh rate in the original stream is once every fifteen frames (M=15) and the temporal resolution used by each trick-play stream is selected to be three, i.e. the number of frame times between frame updates, then for 5 times speed:

    (5× speed) × (3 frame repeats) / (15 frame refresh) = 1.0

thus every TP I-frame will be used. Similarly for 18× and 35× speeds,

    (18) × (3) / (15) = 3.6

    (35) × (3) / (15) = 7.0

Thus at 18× speed approximately every third or fourth I-frame is used, and at 35× speed every seventh I-frame is used. If it is assumed that the intra-refresh period in an advanced television stream is 0.5 seconds (M=15 for 30 fps source) then a three-frame holding time for 5× speed is the highest possible TP temporal resolution. For simplicity and consistency a three-frame holding time may be used for the remaining TP speeds. A higher temporal resolution of two-frame or single-frame holding time could be used for higher TP speeds since lower temporal resolution at higher speeds may give a false sense of slower than actual trick-play speed. Assuming that the effective trick-play bit-rate is constant, the provision of a higher temporal resolution would consequently require a lower spatial resolution quality.
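The I-frame usage ratios above can be reproduced with a short Python sketch. The helper names are hypothetical, and the rule of taking the integer part of each multiple of the ratio is an assumption about how a non-integer ratio such as 3.6 would be applied in practice.

```python
from fractions import Fraction
import math


def iframe_step(speed, holding_frames, refresh_period):
    """Number of source I-frames that pass per trick-play frame update."""
    return Fraction(speed * holding_frames, refresh_period)


def iframes_used(speed, holding_frames, refresh_period, updates):
    """Indices of the source I-frames used for the first `updates` TP frames."""
    step = iframe_step(speed, holding_frames, refresh_period)
    return [math.floor(k * step) for k in range(updates)]


# 18x speed, three-frame holding time, I-frame every 15 frames.
print(iframes_used(18, 3, 15, 6))  # [0, 3, 7, 10, 14, 18]
```

At 18× speed the selected indices advance by three or four I-frames at a time, which matches the "every third or fourth I-frame" behavior described in the text.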

The reconstructed TP I-frame is read from memory 110 and packaged, according to TP speed, by blocks 145, 160 and 170 which add the appropriate MPEG picture headers and a PES layer. The advanced television transport stream 09 is buffered by buffer 15, which generates signal 10, a transport stream for normal play speed processing. Normal play transport stream 10 is coupled to multiplexor MUX 150. Multiplexor MUX 150 is controlled responsive to recorder 210 servo signals to generate an output bit stream having a sequence which when recorded produces a predetermined track format. The recorded track format is selected to provide the desired recorded TP bit rate and to facilitate specific physical location of speed specific TP I-frame packets within specific recorded tracks. The recorded track format thus facilitates replay at normal speed and at the predetermined trick-play speeds. The TP I-frame packets, 5× signal 121, 18× signal 131 and 35× signal 141, are coupled to multiplexor MUX 150 which inserts the I-frame packets for each TP speed into the normal play transport stream. Thus a valid, MPEG like, transport stream is formatted for record processing by recorder 210 and recording on tape 220.

To minimize TP bit rate, in place of repeated TP I frames, frame repeats or holding times may be implemented by writing empty P-frames between I frames in the video stream. An empty P-frame results in the decoder predicting from the previous frame, i.e. the TP I frame. Alternatively, frame repeats may be implemented by setting the DSM_trick_mode_flag in the PES layer and calculating the Presentation Time Stamp and Decode Time Stamp (PTS/DTS) values such that each TP I frame is presented the necessary number of frame times apart. Either frame repeat method produces the same result. However, the second method requires no extra processing of the TP stream on playback and hence adds no extra cost to the unit.

However, the second method requires that the optional DSM_trick_mode_flag is supported in advanced television decoders. With this second method, the extra processing is implemented in the advanced television decoder. Either frame repeat method may be implemented during speed specific stream generation in blocks 145, 160 and 170.
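As an illustration of the second method, the spacing of presentation time stamps for a given holding time might be computed as in the following sketch. It relies on the 90 kHz time base and 33-bit width of MPEG PTS values, and assumes an integer 30 Hz display rate; the function name and the starting value are hypothetical. For an I-frame-only stream no frame reordering occurs, so the DTS may be taken as equal to the PTS.

```python
PTS_CLOCK_HZ = 90_000   # MPEG system time base for PTS/DTS
PTS_MODULO = 1 << 33    # PTS values are 33 bits wide and wrap around


def pts_sequence(first_pts, frame_rate, holding_frames, count):
    """PTS values that present each TP I-frame `holding_frames` frames apart.

    Assumption: frame_rate divides 90 000 exactly (true for 30 Hz).
    """
    ticks_per_frame = PTS_CLOCK_HZ // frame_rate
    step = ticks_per_frame * holding_frames
    return [(first_pts + k * step) % PTS_MODULO for k in range(count)]


# Three-frame holding time at 30 Hz: one new I-frame every 9000 ticks.
print(pts_sequence(0, 30, 3, 4))  # [0, 9000, 18000, 27000]
```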

The inventive trick-play stream generation techniques described above were employed to produce trick-play speeds of 5×, 18× and 35× with a spatial resolution of 720×480 pixels, and an effective trick-play data rate of 2.0 Mbps. The various trick-play speeds were evaluated and may be summarized by the following points:

Data for each trick-play speed was generated representing independent low-resolution (720×480 pixels), MPEG compatible transport streams.

Each TP stream contains only intra-coded frames, thus allowing the same trick-play stream to be used for both Fast Forward and Fast Reverse TP modes.

To retain a 16:9 aspect ratio, the actual spatial image size is sampled to 720×384 pixels, with the remaining area above and below the TP image black.

The temporal resolution is such that a constant three-frame holding time is used, resulting in an effective rate of 10 frames per second.

Each I frame of the trick-play streams comprises a selection of sampled macroblocks from the original stream. The bit rate of 2.0 Mbps and three-frame holding time allows most AC coefficients to remain in the selected macroblocks for typical test material.

The overall subjective spatial resolution is fair, being dependent on the amount of motion and image complexity in the source material. A picture rate of 10 fps provides good temporal resolution. The trick-play data stream may be decoded to produce recognizable trick-play video images and hence is acceptable for tape search usage.
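The figures quoted in the summary above imply a rough bit budget per frame and per macroblock, which can be worked through in a few lines of Python. This is back-of-the-envelope arithmetic only: it assumes a 30 Hz display rate and 16×16 pixel macroblocks, counts only the 720×384 active picture area, and ignores header and transport overhead; the function names are hypothetical.

```python
def bits_per_tp_frame(bit_rate, display_rate, holding_frames):
    """Bits available for each TP I-frame at a fixed holding time."""
    return bit_rate * holding_frames // display_rate


def macroblock_count(width, height, mb_size=16):
    """Number of whole macroblocks in a picture of the given size."""
    return (width // mb_size) * (height // mb_size)


frame_bits = bits_per_tp_frame(2_000_000, 30, 3)   # 200000 bits per I-frame
active_mbs = macroblock_count(720, 384)            # 45 x 24 = 1080 macroblocks
print(frame_bits, active_mbs, frame_bits // active_mbs)  # 200000 1080 185
```

Under these assumptions each macroblock of the active picture has roughly 185 bits available, which is consistent with the observation that most AC coefficients can be retained for typical material.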

The inventive low-resolution real-time trick-play mode previously discussed produces recognizable spatial images at a relatively high temporal resolution. However, as already mentioned, this mode may be used if an advanced television receiver/decoder unit is operable at lower resolution, for example, such as that produced by CCIR Recommendation 601. However, if operation at a lower resolution is not provided, then trick-play data must be derived having nominally the same spatial resolution, i.e. the same pixel count, as the original source. FIG. 2 illustrates an inventive exemplary system for generating full-resolution, real-time trick-play streams. Three trick-play speeds of 5 times, 18 times and 35 times are illustrated. The difference between the full-resolution scheme of FIG. 2 and the low-resolution scheme illustrated in FIG. 1 is in data extraction and processing block 105, and stream generation blocks 155, 165 and 175.

The transport stream decoding and intra detection depicted in blocks 20, 30, 40, 50, 60, and 70 operate and function as described for the low resolution TP system of FIG. 1. As described for the low resolution TP system, the purpose of the data extraction and processing stage, block 105, is to extract only intra information which is required for forming trick-play streams and to perform any processing which is required to guarantee the syntactic and semantic correctness of the resulting TP I-frame. The functionality of block 105 differs from that of block 100 in that the regenerated I-frame must have the same resolution, or pixel count, as the original data stream. Hence, all intra macroblocks are used to reconstruct the new TP I-frame. Since no MBs are deleted, no re-encoding of DC transform coefficients is required.

The major function of processing block 105 is the removal of AC coefficients from each macroblock which, as a consequence of the trick-play bit-rate, cannot be accommodated in the new TP I-frame. The low TP channel bit-rate, nominally 2 Mbit/s, forces a trade-off between the number of AC coefficients used, i.e. spatial resolution, and the temporal resolution, or frame update rate, of the trick-play stream. This spatial versus temporal trade-off was also present in the derivation of the low-resolution stream. However, in a full-resolution frame, i.e. same pixel count, the DC coefficients alone are likely to represent more bits than all the coefficients, both AC and DC, assembled in a low-resolution TP frame. Thus any limited inclusion of even a few AC coefficients in each full-resolution macroblock will produce a significant reduction in the temporal resolution, i.e. the frame update time will be lengthened, with more frame repeats. Thus to facilitate constant temporal resolution in full-resolution trick-play streams, a system may employ only the DC coefficients of each macroblock with all AC coefficients being discarded. In addition, discarding the AC coefficients reduces processing complexity since only variable-length decoding of the DPCM value of the DC coefficient is required. FIG. 2 illustrates an exemplary system where each trick-play speed has the same bit rate, and thus the same I-frame memory may be shared between the three TP speeds.

As discussed previously, if the original HDTV video images were generated by interlaced scanning, then an optional processing step may be included to remove interlace "flicker" exhibited by frozen fields containing motion. One such method has already been described. However, since this exemplary high resolution TP system uses only DC transform coefficients, a simpler and more efficient method may be provided by setting the frame_pred_frame_dct flag in the picture_coding_extension section to `1`. This flag indicates that all MBs were frame encoded, thus a previously field-coded block, which could produce `flicker`, is decoded as a frame-coded block. The result is that each field is placed in either the upper or lower portion of a block and any `flicker` is removed. This method of flicker elimination also reduces the number of bits used in the macroblock_modes section since the dct_type flag can no longer be present if frame_pred_frame_dct is set to `1`.

The reconstructed TP I-frame is assembled in memory 115, and coupled to three trick-play stream generation stages, 5 times speed depicted in block 155, 18 times speed in block 165 and 35 times speed in block 175. The exemplary system of FIG. 2 assumes that each trick-play stream has the same effective bit-rate and hence the same approximate temporal resolution. As discussed previously, not every reconstructed TP I-frame is used for each speed. However, TP I-frame utilization may be further limited for the following reason. Although each TP I-frame has the same number of coefficients, for example DC only, each TP I-frame may not have the same number of bits since the DC coefficients are variable length encoded. Therefore, a constant temporal resolution or frame holding time cannot be fixed for each trick-play stream. Instead the frame holding time will vary slightly over time with the number of bits required to encode or form each TP I-frame. For each trick-play speed, the respective "stream generation" stages, 155, 165 and 175, wait until enough bits have been accumulated in buffer 105 to encode a TP I-frame. Then if the TP I-frame accumulated in the buffer at the time is a new TP I-frame, i.e. one which has not yet been encoded in the specific trick-play speed, the TP I-frame is encoded and the number of bits used will be subtracted from those available. If every I-frame was the same size and each trick play speed was allocated the same effective bit-rate, this scheme would be equivalent to that described for the low-resolution system and the frame refresh period would be constant for all speeds. The reconstructed TP I-frames are read from memory 115 and packaged by stream generators 155, 165 and 175 to form MPEG compatible transport streams in exactly the same way as detailed for the low-resolution system.
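The way the holding time varies with I-frame size can be illustrated with a simple bit-credit model in Python. This is a hedged sketch rather than the scheme of FIG. 2: it assumes that a fixed number of bits becomes available in every frame period, that each I-frame is sent as soon as enough bits have accumulated, and it ignores the check for whether a new I-frame is available; the function name and the example sizes are hypothetical.

```python
def holding_times(frame_sizes_bits, bits_per_frame_period):
    """Frame periods spent waiting before each TP I-frame can be sent.

    frame_sizes_bits: size in bits of each successive TP I-frame.
    bits_per_frame_period: channel bits that accrue in one frame period.
    """
    credit = 0
    times = []
    for size in frame_sizes_bits:
        periods = 0
        while credit < size:
            credit += bits_per_frame_period
            periods += 1
        credit -= size  # bits used are subtracted from those available
        times.append(periods)
    return times


# About 2 Mbit/s at 30 Hz gives roughly 66 666 bits per frame period.
print(holding_times([300_000, 400_000, 350_000], 66_666))  # [5, 6, 5]
```

With equal frame sizes the model yields a constant holding time, which corresponds to the equivalence with the low-resolution system noted in the text.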

The inventive full spatial resolution trick-play stream generation technique described above was evaluated at an effective trick-play data rate of 2.0 Mbps, for trick-play speeds of 5×, 18× and 35×. The performance may be summarized as follows:

An independent, TP I-frame-only MPEG compatible transport stream may be recorded for each trick-play speed.

The temporal resolution varies with scene complexity and is lower, having longer frame holding times than the low spatial resolution trick-play system previously described. The average and the variation in holding times experienced for typical source material are shown in the following table:

    ____________________________________________________
                 AVERAGE HOLDING      VARIATION IN
    TP SPEED     TIME IN FRAMES       FRAMES
    ____________________________________________________
     5X          5 FRAMES             5-8 FRAMES
    18X          5 FRAMES             5-8 FRAMES
    35X          5 FRAMES             5-8 FRAMES
    ____________________________________________________

Note: Because an identical effective trick-play bit-rate is used for all speeds, the temporal resolution will always be similar (if not identical) for each speed.

Each TP I-frame uses only DC coefficients.

The overall quality of spatial resolution is only fair since only DC coefficients are used. The quality of temporal resolution may vary between poor and fair, depending on the level of complexity within the TP encoded material. However, the resulting trick-play images are recognizable and acceptable for tape search usage.

The major differences between real-time trick-play and pre-recorded trick-play data stream derivation result from the constraints of cost and lack of complexity imposed in a consumer recorder/player. The consumer unit must derive and record the trick-play data stream while recording normal replay data, i.e. the trick-play data stream is derived in real-time. With pre-recorded material, trick-play data streams may be derived directly from an original picture source rather than from a compressed MPEG encoded stream. Speed specific TP data streams may be derived independently of one another and independently from the actual recording event. Thus pre-recorded trick-play data may be derived in non-real time, possibly at non-standard or slower frame repetition rates. Since the constraints of the consumer real-time method no longer apply, the quality of trick-play reproduction achieved by pre-recorded material may be significantly higher.

A first inventive method of pre-recorded TP data derivation provides a spatial resolution of, for example, CCIR Rec. 601 having a resolution of 720×480 pixels, regardless of the original HDTV stream resolution. A second inventive method constructs a trick-play stream of the same resolution, i.e. pixel count, as the original HDTV material.

FIG. 3 illustrates an exemplary block diagram showing an inventivemethod for generating low-resolution, pre-recorded trick-play datastreams. Regardless of the format of the original HDTV video material09, temporal processing block 30, performs temporally subsampling whichproduces a 30 Hz, progressive signal 31. The operation of this stage maydiffer depending on whether the original source material is progressivewith a 59.94/60 Hz frame rate or interlaced with a 29.97/30 Hz framerate. With progressively scanned source material, the frame rate may bereduced by dropping every second frame from the sequence. By droppingalternate frames a progressive sequence results having half the temporalresolution of the original source material. With interlaced sourcematerial, the frame rate remains the same but only one field from eachframe is used. This processing results in a progressive sequence of halfthe vertical resolution and the same frame rate.

The progressively scanned frames, signal 31, are coupled to block 40, which generates a lower resolution signal having, for example, the resolution delivered by CCIR Rec. 601. Each progressively scanned frame is resampled to 720×384 pixels to retain the 16:9 aspect ratio, and padded with black upper and lower borders to produce a `letter-box` format of 720×480 pixels.
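The padding step of block 40 may be illustrated by the following sketch, which takes an already resampled 720×384 frame and adds black borders to reach 720×480. The equal split between upper and lower borders, the pixel value 0 for black and the function name `pad_letterbox` are assumptions for illustration; the resampling itself is not shown.

```python
def pad_letterbox(frame, out_height=480, black=0):
    """Pad a frame (list of rows of pixel values) with black borders.

    The border is split equally between top and bottom, which is an
    assumption of this sketch.
    """
    width = len(frame[0])
    total_border = out_height - len(frame)
    if total_border < 0:
        raise ValueError("frame is taller than the output format")
    top = total_border // 2
    bottom = total_border - top
    black_row = [black] * width
    return (
        [list(black_row) for _ in range(top)]
        + [list(row) for row in frame]
        + [list(black_row) for _ in range(bottom)]
    )


# A uniform grey 720x384 active picture, as produced by the resampler.
active = [[128] * 720 for _ in range(384)]
letterboxed = pad_letterbox(active)
```

For the 384-line active picture described above, this places 48 black lines above and 48 below the image.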

The HDTV signal is now represented by signal 41, having a lower spatial resolution of 720×480 pixels, progressively scanned with a 30 Hz frame rate. Signal 41 is coupled to blocks 50, 60, 70 which implement speed-dependent temporal subsampling. Each trick-play stream is constructed to have the same temporal resolution or frame holding time of 2 frames, i.e. every frame will be repeated once. Therefore, at N times trick-play speed, the frame rate is reduced from 30 Hz to 30/(2N) Hz. Thus, the resulting recorded frame rates are as follows: 5× becomes 30/10 Hz, 18× becomes 30/36 Hz and 35× becomes 30/70 Hz. Since every frame is presented twice and the display rate is 30 Hz, the effective speed of scene content remains correct at each TP speed.
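The recorded frame rates quoted above follow directly from the 30 Hz source rate and the 2 frame holding time, as the following sketch shows. The function names are hypothetical, and the replay calculation assumes that reproduction at N times speed recovers the recorded frames N times faster than they were recorded.

```python
from fractions import Fraction

SOURCE_RATE_HZ = 30  # frame rate of signal 41
HOLDING_TIME = 2     # each frame is displayed twice


def recorded_frame_rate(speed):
    """Frame rate of the trick-play stream for N times speed: 30/(2N) Hz."""
    return Fraction(SOURCE_RATE_HZ, HOLDING_TIME * speed)


def replay_update_rate(speed):
    """Rate at which new frames arrive when replayed at N times speed."""
    return recorded_frame_rate(speed) * speed


for speed in (5, 18, 35):
    print(f"{speed}x: recorded at {recorded_frame_rate(speed)} Hz, "
          f"updated at {replay_update_rate(speed)} Hz during replay")
```

Under these assumptions every speed yields the same update rate of 15 new frames per second, each shown twice at the 30 Hz display rate.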

The temporal subsampling blocks 50, 60, 70 generate output bit streams 51, 61 and 71 respectively, which are coupled to respective MPEG encoders 120, 130 and 140 to format MPEG compatible bit streams. Since the MPEG compatible encoding is the same for each speed, and because in a pre-recording environment real-time processing is not necessary, the same MPEG encoding hardware may be used to encode the normal-play stream and each trick-play stream. This commonality of usage is indicated by the broken line enclosing the MPEG encoder blocks 100, 120, 130, and 140. The temporally subsampled bit streams 51, 61 and 71 are MPEG encoded as I-frames. Each I-frame is repeated once by employing the DSM_trick_play_flag, located in the PES layer as described previously. The resulting MPEG compatible streams representing normal play speed NP, stream 101, and trick-play speeds of 5×, stream 121, 18×, stream 131 and 35×, stream 141, are coupled for record formatting by multiplexor 150. Multiplexor 150 effectively selects between the various MPEG streams to generate a sync block format signal 200, suitable for record processing by record replay system 210 and writing to tape 220. As described earlier, the use of predetermined TP speeds allows speed specific TP data to be positioned, or recorded, at specific sync block locations within recorded tracks. Thus multiplexor 150 formats sync block signal 200 to locate speed specific TP I frame data at specific sync block locations within the recorded tracks. These specific locations facilitate reproduction at the various specific TP speeds.

FIG. 6 is a partial block diagram illustrating a further inventive arrangement of the non-real-time "trick-play" apparatus of FIG. 3. Speed specifically processed TP signals 51, 61 and 71 are coupled to memories 520, 530 and 540 which store the 5 times, 18 times and 35 times processed digital image signals respectively. The original HDTV signal 09 is also stored in memory 500. Production of the prerecorded media or tape is facilitated by the sequential selection between the various stored digital signal sources to form an output signal which is MPEG encoded by encoder 100 and recorded on the media. A multiplexor 150 is controlled to select between the various digital signal sources to form an output signal for MPEG encoding. The MPEG encoded signal 200 has the various signal components arranged such that a recording may be replayed at normal and trick play speeds. Thus the inventive arrangement of FIG. 6 facilitates the non-real-time, and independent, derivation of both normal play and trick play digital signal sources for encoding as MPEG compatible bit streams.

FIG. 7 is a partial block diagram illustrating another inventive arrangement of the non-real-time "trick-play" apparatus of FIG. 3. In FIG. 7 both normal play and trick play processed digital signals 09, 51, 61 and 71 are coupled for encoding as MPEG compatible bit streams by encoder 100. With non-real-time signal processing and pre-recorded material preparation, signals 09, 51, 61 and 71 may be derived separately and individually coupled for MPEG encoding by a single encoder 100. The individually coded MPEG bit streams 101, 121, 131 and 141 are stored in memories 550, 560, 570 and 580 representing normal play and 5×, 18× and 35× bit streams respectively. Memories 550, 560, 570 and 580 produce output signals 501, 521, 531 and 541 which are coupled to multiplexor 150, which is controlled responsive to recorder 210 to generate an MPEG compatible record bit stream formatted so as to provide reproduction at normal play speed and at the predetermined "trick-play" speeds.

The exemplary, low spatial resolution TP system illustrated in FIG. 3, and described above, produces trick-play quality significantly higher than that attainable from real-time derived trick-play streams. The results produced may be summarized as follows.

During recording, an independent, I-frame only, low-resolution (720×480 pixel) MPEG compatible stream is written to tape for each trick-play speed.

The actual spatial image size is 720×384 pixels, to retain 16:9 aspect ratio, presented in a "letter box" format.

The temporal resolution is effectively 15 frames/second for each trick-play speed and produces good to excellent quality which remains constant for each speed.

The spatial resolution produced by a 2.0 Mbps data rate and 720×480 pixels resolution is good to very good, depending on the complexity of the source material.

Overall, the trick-play image quality exhibited with this scheme is very high.

The low-resolution pre-recorded trick-play system shown in FIG. 3 and described above produces good quality spatial images at a relatively high temporal resolution. However, such a low-resolution method may be used only provided that the advanced television decoder/receiver unit is able to support the lower resolution display format.

FIG. 4 is an exemplary block diagram of an inventive full-resolution, pre-recorded trick-play stream generation system, providing trick-play speeds of 5×, 18× and 35×. As previously discussed, pre-recorded trick play data streams may be derived from the original, uncompressed, source material. FIG. 4 illustrates the generation of normal-play and trick-play bit streams together; however, these may be generated independently of one another, directly from the HDTV source material. Since this system provides full resolution, no spatial sub-sampling is required and hence less processing is required than that shown in FIG. 3. Since the original, uncompressed, source material may be used, frames which are to be intra-coded may be chosen exactly to suit the trick play speed, rather than selecting I frames from an encoded stream. In addition a constant temporal refresh rate can be maintained, which is more pleasing to the user.

The original HDTV video signal 09 is shown coupled to MPEG encoder 100 which generates an MPEG stream 101 for normal play speed operation. Signal 09 is also coupled for temporal subsampling in blocks 55, 65 and 75 respectively. For a trick-play speed of N times, only every Nth source frame may be utilized for coding. However, depending on a desired trade-off between spatial and temporal resolution, the actual frames used for encoding may be closer to every 5Nth or 8Nth frame in order to provide an acceptable spatial resolution. Hence frame holding times, or temporal resolution, are similar to those of the real-time, full-resolution system described earlier.

Having selected a frame holding or update time, for example, every 5Nth frame for each N times trick-play speed, the HDTV stream, signal 09, is temporally sub-sampled for each TP speed. The 5 times TP stream is derived in block 55 which temporally subsamples by a factor of 1/5N, or 1/25, i.e. 1 frame in 25 is selected to generate output signal 56. Similarly, the 18 times TP stream is derived in block 65, which temporally sub-samples by a factor of 1/5N, or 1/90, and generates output signal 66. The 35 times TP stream is derived in block 75, which temporally sub-samples by a factor of 1/5N, or 1/175, and generates output signal 76. The three sub-sampled TP bit stream signals, 56, 66 and 76, are coupled for MPEG encoding in encoder blocks 120, 130 and 140 respectively.
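The frame selection performed by blocks 55, 65 and 75 may be sketched as follows, assuming the exemplary 5 frame holding time. The function names and the choice of frame 0 as the first selected frame are assumptions made for illustration.

```python
HOLDING_TIME = 5  # exemplary frame holding time, in frames


def subsampling_step(speed, holding_time=HOLDING_TIME):
    """Spacing between selected source frames for N times speed: 5N."""
    return holding_time * speed


def select_frames(num_source_frames, speed, holding_time=HOLDING_TIME):
    """Indices of the source frames kept for the N times speed stream."""
    return list(range(0, num_source_frames, subsampling_step(speed, holding_time)))


# Selection spacing for the three predetermined speeds.
steps = {speed: subsampling_step(speed) for speed in (5, 18, 35)}

# Frames kept from the first 100 source frames at 5 times speed.
kept_5x = select_frames(100, 5)
```

With a 5 frame holding time the spacings evaluate to 25, 90 and 175 source frames for the 5×, 18× and 35× streams respectively.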

Since MPEG compatible encoding is the same for each speed, and because real-time processing is not necessary in a pre-recording environment, the same MPEG encoding hardware may be used to encode the normal-play stream and each trick-play stream. This commonality of usage is indicated by the broken line enclosing the MPEG encoder blocks 100, 120, 130, and 140. The temporally subsampled bit streams 56, 66 and 76 are MPEG encoded as I-frames. Because the frame update time is constant throughout each trick-play stream, so is the number of bits allocated for each I-frame. The frame holding times, or I-frame repeats, may be implemented by employing the DSM_trick_play_flag as previously described. The resulting MPEG transport streams representing normal play speed NP, stream 101, and trick-play speeds of 5×, stream 121, 18×, stream 131 and 35×, stream 141, are coupled for record formatting by multiplexor 150. Multiplexor 150 effectively selects between the various MPEG streams to generate a sync block format signal 200, suitable for record processing by record replay system 210 and writing to tape 220. As previously described, predetermined TP speeds allow speed specific TP data to be positioned, or recorded, at specific locations within recorded tracks. Thus multiplexor 150 formats sync block signal 200 to locate speed specific TP I frame data at specific sync block locations which facilitate reproduction at the various specific TP speeds.

The inventive arrangements of FIGS. 6 and 7 may also be applied to the non-real-time "trick-play" generation arrangement of FIG. 4. As has been described, the arrangements of FIGS. 6 and 7 may facilitate the independent derivation of normal play and trick play digital signals for subsequent formatting and MPEG encoding for pre-recorded tape production or user controlled video on demand service.

The constraints of retaining full spatial and temporal resolution result in a trick-play quality which is very similar to that achieved by the full-resolution real-time method. However, this pre-recording method has the advantage that the frame holding time is constant. The trick-play stream generation technique described provides trick-play speeds of 5×, 18× and 35×, having full spatial resolution, and an effective trick-play bit rate of 2.0 Mbps. The performance may be summarized as follows:

During recording, an independent, I-frame only, MPEG stream is written to tape for each trick-play speed.

The spatial resolution is the same as the source material.

The temporal resolution is fixed having a 5 frame holding time.

Each I-frame uses all DC and some AC coefficients.

The overall spatial quality is fair. Recovered trick-play images are recognizable and are acceptable for tape search purposes.
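The constant bit allocation per I-frame implied by a fixed holding time may be estimated as in the sketch below. It assumes a 30 Hz display rate, treats the 2.0 Mbps effective trick-play bit rate as entirely available for picture data, and ignores transport and PES overhead; none of these simplifications is stated in the disclosure, and the function name is hypothetical.

```python
def bits_per_i_frame(bit_rate_bps=2_000_000, display_rate_hz=30, holding_time=5):
    """Rough bit budget for each I-frame of a trick-play stream.

    A new I-frame is needed once per `holding_time` display periods, so
    the budget is the bit rate divided by the I-frame update rate.
    """
    i_frames_per_second = display_rate_hz / holding_time
    return bit_rate_bps / i_frames_per_second


budget = bits_per_i_frame()
```

Under these assumptions six I-frames are delivered per second, giving a budget of roughly 333,000 bits per I-frame, which must cover a full-resolution picture and is consistent with the use of all DC but only some AC coefficients.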

The following table summarizes trick-play quality achieved by the various inventive methods disclosed.

    ______________________________________________________________________
                  REAL-TIME TRICK PLAY        NON-REAL-TIME TRICK PLAY
                  STREAM GENERATION           STREAM GENERATION
    ______________________________________________________________________
    FULL          SPATIAL QUALITY:            SPATIAL QUALITY:
    RESOLUTION    poor to fair, only DC       poor to fair, DC & some AC
    TRICK PLAY    coefficients used.          coefficients used.
    MODES         TEMPORAL QUALITY:           TEMPORAL QUALITY:
                  poor to acceptable,         poor to acceptable,
                  variable 5-8 frame          constant 5 frame
                  holding times.              holding time.

    LOW           SPATIAL QUALITY:            SPATIAL QUALITY:
    RESOLUTION    poor to good,               good to very good,
    TRICK PLAY    depends on material,        depends on material.
    MODES         patchwork of MBs used.
                  TEMPORAL QUALITY:           TEMPORAL QUALITY:
                  good, constant 3            very good, constant 2
                  frame holding time.         frame holding time.
    ______________________________________________________________________

In view of the constraints discussed previously, the highest trick-play quality may be achieved, in both real-time and pre-recorded material, by the use of lower-resolution trick-play data. However, the advanced television receiver/decoder must support the use of a low resolution mode. If full-resolution trick-play modes are utilized, the quality provided may be enhanced by manipulation of various parameters. For example, raising the effective bit-rate available for each trick-play speed will allow an increase in resolution. However, a minimum bit-rate of approximately 2.0 Mbps is required. If the number of "Trick Play" speeds provided is reduced, for example to two in each direction, then the effective bit-rate for each remaining speed may be increased. The effective temporal resolution, or number of frame repeats, results from the trade-off between temporal and spatial resolution. Hence either parameter may be optimized depending on the desired application.

I claim:
 1. A method for generating an MPEG compatible digital image representative signal for recording which facilitates reproduction at more than one speed, said method comprising the steps of: a) receiving a data stream comprising an MPEG compatible digital image representative signal (09); b) decoding said data stream (09) to extract intra-coded data (71); c) storing predetermined intra-coded data (103) from said extracted intra-coded data (71) to form an intra-coded frame having reduced spatial resolution (111); d) periodically selecting said intra-coded frame from said stored frame having reduced spatial resolution (111); e) sequentially selecting said periodically selected intra-coded frame (121, 131, 141) and said data stream (10) to form an MPEG compatible bit stream (200); and, f) recording (210) said MPEG compatible bit stream (200).
 2. The method of claim 1, comprising an additional step of; selecting from said extracted intra-coded data (71) predetermined intra-coded data by reading a look up table.
 3. The method of claim 1, comprising an additional step of; selecting from said extracted intra-coded data (71) predetermined intra-coded data by performing a predetermined calculation.
 4. The method of claim 1, additionally comprising a step of; selecting said periodically selected intra-coded frame (111) at a rate related to a predetermined trick play speed.
 5. The method of claim 4, wherein said additional step further comprises a step of; storing said trick play speed determined intra-coded frame to form a trick play speed specific intra-coded frame (121, 131, 141).
 6. The method of claim 1, wherein said step d) additionally comprises a step of; selecting said periodically selected intra-coded frame (111) at a rate related to a predetermined temporal resolution.
 7. The method of claim 1, wherein said step e) additionally comprises a step of; controlling (FMT CTRL) said sequential selection of said periodically selected frame (121, 131, 141) to facilitate reproduction of said MPEG compatible bit stream (200) at a replay speed different than normal replay speed.
 8. The method of claim 1, wherein said sequential selection is controlled responsive to a format control signal (FMT CTRL) which includes a control signal (211) from a recorder (210) recording said MPEG compatible bit stream (200).
 9. The method of claim 1, wherein said predetermined intra-coded data comprises predetermined intra-coded macroblocks (MB1, MB2).
 10. The method of claim 9, wherein said predetermined intra-coded macroblocks comprise luminance and chrominance blocks (0, 1, 2, 3, 4, 5).
 11. The method of claim 9, comprising an additional step of; copying blocks 0 and 1 to blocks 2 and 3 within macroblocks which are field coded.
 12. The method of claim 10, wherein said step c) additionally includes a step of; re-encoding predicted DC coefficients for first blocks of said predetermined macroblock (MB3) from an immediately preceding predetermined macroblock (MB1).
 13. The method of claim 5, comprising an additional step of; repeating said trick play speed specific intra-coded frame by inserting an empty P frame to replace said trick play speed specific intra-coded frame (121, 131, 141).
 14. The method of claim 5, comprising an additional step of; setting a DSM_trick_play_flag in a Packetized Elementary Stream layer of said MPEG compatible signal to repeat said trick play speed specific intra-coded frame (121, 131, 141).
 15. The method of claim 1, wherein said intra-coded frame having reduced spatial resolution (71) comprises 720×480 pixels.
 16. The method of claim 9, wherein said intra-coded macroblocks (MB1, MB3) comprise DC discrete cosine transform coefficients and selected AC discrete cosine transform coefficients.
 17. The method of claim 16, wherein said selected AC discrete cosine transform coefficients are selected in accordance with a predetermined number of bits allocated per macroblock.